2nd IEEE Workshop on Interactive Voice Technology for Telecommunications Applications (IVTTA94)



Kyoto, Japan
September 26-27, 1994



A MULTIMODAL CONSUMER INFORMATION SERVER WITH IVR MENU

Martin Damhuis, Marc Peeters, Louis Boves 
Royal PTT Netherlands N.V., PTT Research
St. Paulusstraat 4, 2264 XZ Leidschendam
The Netherlands 



ABSTRACT 

This paper describes the development of a fully automatic
multimodal information system for the consumer market. The
system will be able to provide information on a large number
of topics via a single telephone number. The eventual system
will integrate Interactive Voice Response, speech recognition,
speaker verification, Direct Dial In, Calling Line
Identification, facsimile and electronic mail. The present
version is limited to DTMF input and voice and facsimile
output. The architecture of the system described in this paper
allows successive addition of other technologies.

I. INTRODUCTION

A consumer Information Server must be suitable for use by
the general public. This means that the system must fulfil
certain demands:

• Information should be interesting for a large group of 
potential users. 

• This server should add value to existing services.

• Services should be easy to understand and simple to use. 

• All services should be fully automatic. 

• All services should have a uniform and user friendly
interface.

The present version of the information server that we are
developing employs an IVR menu with DTMF input as the
means by which the callers can interact with the system and
selectively retrieve information items from a database. Two
applications are described in detail, viz. tele-shopping for
consumer telecommunication products and a service giving
information about consumer tests of telecommunication
hardware. Both services use speech synthesis and facsimile to
deliver the requested information to the caller. The
information in the system can be updated remotely, partly
by means of a PC and a network, partly via another IVR
application that allows the system administrator to record
new voice prompts and to add new information to the
system's database. The following sections deal with the
system configuration, a description of the information server
and the services that are already implemented; section 4
describes a number of developments which are planned for
the near future: the use of advanced techniques like Calling
Line Identification (CLI), speech recognition, speaker
verification and text to speech synthesis will allow us to
improve the information server.

II. SYSTEM CONFIGURATION

2.1 System architecture 

The system architecture of the multimodal consumer
information server is based on the concept that is visualised
in Figure 1.




Figure 1: concept of multimodal information server



This concept distinguishes between a number of abstract
platforms that are needed to give a consumer access to an
information server:

• The input platform makes it possible to order
information.

• The output platform makes it possible to receive and
perceive information.

• The network platform makes it possible to transmit
information from one point to another.

• The server platform provides information.

A service description, made according to this concept, is
independent of the chosen platform. The input and output
platforms are not necessarily identical. For example, a
consumer orders the information via an interactive voice
response menu and the information is delivered via electronic
mail. In this example the telephone is the input platform and
the PC is the output platform. This makes the consumer
information service multimodal.
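
The separation of input and output platform can be sketched in a few lines of Python. This is an illustration only, not the data model of the system described here; the class name `ServiceRequest`, its fields and the platform strings are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRequest:
    """A request that records where it was ordered and where it is delivered."""
    item_id: str          # hypothetical identifier of the requested information
    input_platform: str   # e.g. "telephone"
    output_platform: str  # e.g. "telephone", "fax", "post", "electronic mail"

    def uses_different_platforms(self) -> bool:
        # True when the information is ordered on one platform
        # and delivered on another, as in the e-mail example.
        return self.input_platform != self.output_platform


request = ServiceRequest("cordless-test", "telephone", "electronic mail")
print(request.uses_different_platforms())
```

Because the request only names its platforms, the same service description can be reused when a new input or output platform is added.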

The system architecture of the multimodal consumer
information server is given in Figure 2.

Figure 2: system architecture of multimodal information server

An application built according to this system architecture can
provide a fully automatic service from end to end.

2.1.1 Input platform

The input platform of the present version of the system is a
telephone which is used to access, select and perhaps order
information via an IVR menu, over the Public Switched
Telephone Network (PSTN). Telephone access to an
automatic information server has several advantages:
telephones are always easy to find, even when a customer is
away from home or office. If telephone input is limited to
recognition of DTMF codes, however, it is a very unattractive
medium for inputting large amounts of information. Yet,
complex applications may need much input. For this reason, we
envisage the use of speech recognition in the information
ordering process instead of or in addition to the keypad of the
telephone with DTMF tones.

The output during the interactive voice response menu is of 
course restricted to speech (or audio) because of the 
telephone. 

2.1.2 Output platform

There is a range of output platforms; the selection depends on
which of the platforms available to the consumer is most
appropriate:

• The telephone can be used as output platform, so the
information is given via the same platform as the input.
This is useful if the message is short.

• Post as output platform is, like the telephone, a platform
accessible to everybody. The post is a powerful platform
to receive information, but it needs some time to deliver
and the delivery address is not easy to provide via an
IVR menu with DTMF input.

• Electronic mail is a means to deliver information
automatically and in several different multimedia
formats. However, few consumers have access to
electronic mail and there is also a problem for the
consumer to provide the mail address via an IVR menu
with DTMF input.

• The fax is an output platform that is useful to deliver
text and graphics. This way of information delivery can
benefit from the increasing popularity of the consumer fax.
The fax number of the consumer is easy to provide in an
IVR menu with DTMF input.



To implement the output platform in our system an existing
platform is used that provides multimodal delivery via fax
and mail, viz. Europub [1]. At the end of a successful query
menu an electronic mail message is sent to the Europub
platform, which takes care of the delivery in the mode
requested by the caller.

2.2 System Hardware

The workstation used for realising the input platform is based
on a PC running the OS/2 operating system, a Rhetorex Voice
Card and driver software. The Voice Card has a four channel
analogue telephone interface. The platform uses system
messages encoded in a 4-bit ADPCM format. It can be
connected to other cards via an ABC connector compatible
with the MVIP interface. Other technologies can be added to
the system by connecting facsimile cards and speech
recognition boards to the voice card via this MVIP interface.

2.3 Service Creation Tools

For creating speech interactive services it is important to have
a software tool that enables a service developer to create
custom applications rapidly and which additionally provides
system management functions. In this project we used the
Show N Tel application generator, which is a 4GL tool with a
point and click icon based graphical user interface.
Applications can be developed quickly and emerging
standards with regard to menu interaction can be
incorporated into the tool, allowing services to have a
consistent "look and feel". Together with the Rhetorex
software this application generator software includes tools for
service development, voice prompt recording and editing
tools, database functionality and support, management
information and maintenance tools.

2.4 Use of PTT Telecom Style Guide

An important aspect of designing IVR menus is that all
services offered by PTT Telecom should have the same "look
and feel". IVR menus are not always easy to design and it has
been found that users are quickly discouraged from using
systems that are difficult to understand or tedious to use. A
style guide is essentially a set of rules that will help
developers to create user friendly IVR menus. A consistent
style includes using standard phrasing and arrangement of the
menu structure to ensure that the consumer will find learning
how to use a new system much easier. The applications that
are part of the Information Server have been built in
conformance with the style guide adopted by PTT Telecom.
This style guide for menus has been developed at PTT
Research. It is partly based on previous work of standards
bodies such as the European Telecommunication Standards
Institute (ETSI), and associations such as the Voice
Messaging User Interface Forum (VMUIF). The final style
guide was made by evaluating human factors issues of
existing automated telephone services in the Netherlands and
by performing usability tests of prototypes of new voice
applications (menus with DTMF or speech recognition) that
were developed at PTT Research. The features of the PTT
Telecom style guide that are most important for the
development of the information server are summarised in
Table 1.

Table 1: Specific features of PTT Telecom style guide

Key   Functions
1     menu choice, yes
2     menu choice, no
7     go back (only used when * is not possible)
9     main menu
0     request help, operator fallback, entrance test
*     go back, abort entry, repeat
#     go ahead, terminate entry, interrupt

• type ahead must be allowed at menus and data entry

• each option is followed by its action number

• menu structure: max. four choices and three layers deep
(except for sequential entry of data)
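
The rules of Table 1 lend themselves to an automatic check of a menu design. The following Python sketch is an illustration only: the `Menu` class and the helper names are assumptions made for this example, not part of the service creation tool, and the exception for sequential data entry is not modelled.

```python
from dataclasses import dataclass, field
from typing import List

# Key assignments as listed in Table 1.
KEY_FUNCTIONS = {
    "1": "menu choice, yes",
    "2": "menu choice, no",
    "7": "go back (only used when * is not possible)",
    "9": "main menu",
    "0": "request help, operator fallback, entrance test",
    "*": "go back, abort entry, repeat",
    "#": "go ahead, terminate entry, interrupt",
}

MAX_CHOICES = 4  # at most four choices per menu
MAX_LAYERS = 3   # at most three menu layers deep


@dataclass
class Menu:
    """A menu node; a node without choices is an end point, not a menu layer."""
    prompt: str
    choices: List["Menu"] = field(default_factory=list)


def layers(menu: Menu) -> int:
    """Count the menu layers from this node down to the deepest end point."""
    if not menu.choices:
        return 0
    return 1 + max(layers(choice) for choice in menu.choices)


def too_wide(menu: Menu, path: str = "main") -> List[str]:
    """List the paths of all menus that offer more than MAX_CHOICES choices."""
    found = [path] if len(menu.choices) > MAX_CHOICES else []
    for number, choice in enumerate(menu.choices, start=1):
        found.extend(too_wide(choice, f"{path}/{number}"))
    return found


def conforms(menu: Menu) -> bool:
    """True when the menu tree respects both structural limits."""
    return layers(menu) <= MAX_LAYERS and not too_wide(menu)
```

A check of this kind could be run on every new menu before prompts are recorded, so that violations of the style guide are caught early.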



III. OVERVIEW INFORMATION SERVER

There is a broad range of applications that can be offered
with the system configuration. In this section two promising
applications are described that fulfil the demands that are
mentioned in the introduction. Eventually, the applications
can be accessed via three alternative charging methods:

1. Two or more separate telephone numbers, each with an 
appropriate charge. 

2. One telephone number that guides the caller through an
IVR menu, where the choice can be made between
different applications with appropriate tariffs. Presently,
this service is not yet available in the Netherlands.

3. The Scopecard number*, which can be provided via
DTMF input and which allows charging the caller for the
costs incurred in accessing the information service.

The following paragraphs describe two applications that are
implemented for the Dutch consumer market. The two
applications can be accessed via one telephone number
(alternative 2) and the charging is done via the Scopecard
number. The only relationship that these applications have is
that they are offered by one company.

3.1 Teleshopping service for consumer telecommunications products

PTT Telecom offers two ways of buying products: 

• Via shops in major cities. 

• Via human telephone operators. 

A new way of selling products is fully automatic
teleshopping. By providing consumers with brochures
containing information about the portfolio, product ordering
numbers and a special telephone number, a teleshopping
service for the consumer telecommunications products is
created. In the automatic service three different steps have to
be taken in the IVR menu to order products:

• Ordering 

• Identification

• Order verification



Ordering 

In the IVR menu the caller can get an overview of the
products that are offered (which can be sent to the caller by
fax or post); moreover, one has the opportunity to order
products. To order the products the ordering numbers must be
provided by DTMF input. The IVR menu gives the caller the
opportunity to order several products or to end the ordering
process. At the end of the ordering process the caller hears
the list of ordered products and the total costs. If the total cost
is below a given threshold, the system will advise the caller
that the order is rejected, unless additional products are
ordered.
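
The end of the ordering step can be sketched as follows. This is an illustration under stated assumptions: the ordering numbers, descriptions, prices and the threshold value are all invented for the example, since no figures are given in this paper.

```python
from decimal import Decimal

# Hypothetical catalogue: ordering number -> (description, price).
CATALOGUE = {
    "1001": ("cordless telephone", Decimal("199.00")),
    "1002": ("answering machine", Decimal("129.00")),
    "1003": ("telephone cord", Decimal("9.95")),
}

# Assumed minimum order value; the actual threshold is not stated.
MINIMUM_ORDER = Decimal("25.00")


def summarise_order(ordering_numbers):
    """Return (descriptions, total, accepted) for the entered ordering numbers.

    The order is accepted only when the total is not below the threshold.
    An unknown ordering number raises KeyError in this sketch.
    """
    lines = [CATALOGUE[number] for number in ordering_numbers]
    total = sum((price for _, price in lines), Decimal("0"))
    descriptions = [description for description, _ in lines]
    return descriptions, total, total >= MINIMUM_ORDER
```

When `accepted` is false the menu would return to the ordering step, so that the caller can add products or end the call.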

Identification

Identification of the caller is done via the Scopecard. The
Scopecard number and a PIN have to be entered. The payment
of the bill can also be done with this unique number.

Order verification

If the caller wants a confirmation of the order he/she can
indicate this at the end of the IVR menu; one can also select
the way in which the confirmation should be forwarded
(mail, post or fax). DTMF tones are used to identify the fax
number. In the future the Scopecard number can be used to
identify the post address, the mail address or the fax number
of the caller. The information of the caller and his/her order
is put into a data file which is sent to a remote system via
electronic mail. This file contains information like a personal
ID (the Scopecard number), the way the information has to be
delivered (fax, electronic mail or paper mail), and an ID of
the ordered products or requested information (e.g. the list
with the overview of the products that can be ordered).
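
The contents of such a data file can be sketched as below. The three fields follow the description above; the file layout itself (one "name: value" line per field) and all names in the code are assumptions made for illustration, since the actual format exchanged with the remote system is not specified here.

```python
from dataclasses import dataclass
from typing import List

# Delivery modes named in the text.
DELIVERY_MODES = ("fax", "electronic mail", "paper mail")


@dataclass
class DeliveryRequest:
    personal_id: str     # the Scopecard number
    delivery_mode: str   # one of DELIVERY_MODES
    item_ids: List[str]  # IDs of ordered products or requested information

    def to_text(self) -> str:
        """Serialise the request in a hypothetical line-based layout."""
        if self.delivery_mode not in DELIVERY_MODES:
            raise ValueError(f"unknown delivery mode: {self.delivery_mode}")
        lines = [
            f"personal-id: {self.personal_id}",
            f"delivery: {self.delivery_mode}",
        ]
        lines += [f"item: {item}" for item in self.item_ids]
        return "\n".join(lines) + "\n"
```

The resulting text could then be placed in the body of the electronic mail message that is sent to the remote system.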

3.2 Consumers' Hardware Test Information Service

Consumers often have the need for an independent opinion
about certain products they want to purchase. For example,
you want to buy a (new) cordless telephone set, something
you do once every 10 years, and you want the latest test of an
independent organisation. The consumer information server
can be the key to the latest test of cordless telephone sets and
provide you with the information via voice response,
facsimile or (post/electronic) mail.

In the IVR menu two different steps have to be taken to find
and order the information:

• Identification of the information

• Information retrieval identification

Identification of the information

The IVR menu will first of all give an overview of the
Consumers' Hardware Test Information that is available. The
caller can also ask for a list of the available tests, which can
be sent to the caller by fax or post.

To allow the caller to go faster through the long list of items,
a skip and scan method is implemented. Via the IVR menu
the caller can find his/her way to the information desired.
The application will read a short conclusion of the selected
test report. The caller can then decide to stop or request more
detailed information about the test.
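
The skip and scan navigation can be sketched as a cursor that moves through the list under control of DTMF keys. The key assignment below ("#" go ahead, "*" go back, "1" yes) is borrowed from Table 1 as an assumption; how the actual service maps keys to skip and scan actions is not detailed in this paper.

```python
def skip_and_scan(items, keys):
    """Walk through `items` driven by DTMF `keys`.

    Returns the item selected with "1", or None when the caller
    never makes a selection.
    """
    if not items:
        return None
    position = 0
    for key in keys:
        if key == "#":    # go ahead: skip to the next item, stop at the end
            position = min(position + 1, len(items) - 1)
        elif key == "*":  # go back: return to the previous item
            position = max(position - 1, 0)
        elif key == "1":  # yes: select the item currently being read
            return items[position]
        # any other key is ignored in this sketch
    return None


tests = ["cordless telephones", "answering machines", "fax machines"]
print(skip_and_scan(tests, "##1"))
```

Because the caller may interrupt a prompt with "#", a long list can be scanned without listening to every item in full.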

Information retrieval identification

If the consumer does want more detailed information about
the tests he/she must indicate this and select the way in
which the information should be sent (mail, post or fax).
DTMF tones are used to identify the fax number. In the
future the Scopecard number will be used to identify the post
address, the electronic mail address or the fax number of the
consumer directly.

3.3 Remote update

An information or service provider will require some means
of updating the information on the system. This can be done
locally on the actual system but it is often more convenient to
have the ability to do this remotely. Some of the information
in the system can be updated remotely with another IVR
application that allows the system administrator to record
new prompts. To update the information that has to be sent
by e.g. facsimile, an update service by means of a PC and
modem or PC via LAN is a possibility. This allows
documents to be sent in a variety of formats, e.g. ASCII text,
PostScript or one of the standard PC formats, e.g. facsimile
format.

IV. FUTURE DEVELOPMENTS 

Probably the most attractive feature of the advanced 
Information Server under development is that it provides 
access to a rich set of services via a single telephone number. 
However, the advantage of having to remember only a single 
number creates the disadvantage of a more complex service 
selection. Thus, there is a need to replace the selection menu 
with a mixed-initiative dialogue. Advanced features of the 
telecommunication network can also be used to enhance the 
service quality. The system architecture, based on a number
of abstract platforms, allows easy integration of new
technologies.

4.1 Speech Recognition 

Work is under way to offer automatic speech recognition
(ASR) as an additional input platform. In its first version
ASR will emulate DTMF recognition, making the service
accessible from rotary dial phones. The next step will
improve menu navigation by allowing the caller to speak the
names of services and products instead of giving numerical
codes. Eventually we envisage the replacement of the menu
interface with a mixed initiative dialogue interface, using the
speech and NLP technology which is now under development
in collaboration with Philips Research in Aachen. ASR will
also solve the problem of entering caller address information.
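
The first step, ASR emulating DTMF recognition, amounts to translating recognised words into the key codes that the existing menus already accept. The sketch below illustrates this idea only; the English vocabulary and the function name are assumptions, and a deployed service would use whatever vocabulary its recogniser supports.

```python
# Hypothetical recogniser vocabulary mapped onto telephone keys.
WORD_TO_KEY = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "star": "*", "hash": "#",
}


def emulate_dtmf(recognised_words):
    """Translate recognised words into a DTMF-style key sequence.

    Words outside the vocabulary are dropped in this sketch.
    """
    keys = []
    for word in recognised_words:
        key = WORD_TO_KEY.get(word.lower())
        if key is not None:
            keys.append(key)
    return "".join(keys)


print(emulate_dtmf(["nine"]))  # the key for "main menu" in Table 1
```

With such a layer in front of the menu logic, the menus themselves need no change when speech input is introduced.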

4.2 Speaker Verification

Mail ordering is a transaction process that requires a certain
amount of protection against fraud. In the present version
fraud protection hinges on the PIN code of the user's
Scopecard. It is well known that this code is easy to steal.
Thus, we intend to introduce speaker verification techniques
to reduce fraud. This will be done in close collaboration with
Card Services, taking advantage of the fact that much work
on the coupling of databases has already been done for
allowing billing via the Scopecard.

4.3 Calling Line Identification

PTT Telecom will introduce CLI in the network in the near
future. Our service will use CLI data as an extra level of
security: if the caller's name and/or address do not match the
database information a special verification dialogue will be
entered. The interaction with residential callers will be
simplified by verifying database address (and billing)
information, instead of asking the caller to provide that
information from scratch.
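
The intended use of CLI data can be sketched as a simple decision on which dialogue to enter. The record type, the function and the dialogue names are assumptions made for this illustration; in particular, the fallback for calls without CLI data is not described in this paper and is added here only to make the sketch complete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subscriber:
    name: str
    address: str


def next_dialogue(cli_record: Optional[Subscriber], database_record: Subscriber) -> str:
    """Choose the dialogue that follows identification of the caller."""
    if cli_record is None:
        # Assumed fallback: without CLI data the caller supplies the details.
        return "ask_caller"
    if cli_record == database_record:
        # Name and address match: only confirm the stored information.
        return "confirm_address"
    # Name and/or address differ: enter the special verification dialogue.
    return "special_verification"
```

Comparing the two records as a whole means that a difference in either the name or the address is enough to trigger the special verification dialogue.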

V. CONCLUSIONS 

This paper has described the concepts of a consumer
information server and has outlined the possibilities of such
systems. The addition of facsimile and other output facilities
to interactive voice response services enables a variety of new
applications that require high density information to be sent
to the consumer.

Careful attention to the design and style of voice response
menus will help to increase the use of these services, make it
easier and quicker to develop new services and provide
greater customer satisfaction.

Future enhancements like speech recognition and speaker
verification will make more applications possible and will
help to increase the quality of available services.

* The Scopecard is useful to telephone abroad and receive the bill
later through a national bank account.